How To Fix Python Error


This is a very common error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0'

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position x

Fix – UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0':

The error is common when handling Unicode text, for example when you fetch or crawl data from web pages on different sites.

Let’s understand why this problem is happening –


When Python has to convert text to bytes and you have not told it which encoding to use, it falls back on a default character encoding. If you check the value of sys.stdout.encoding, it is sometimes None (for example in Python 2 when output is piped or redirected).

On Linux, the default can be found in /etc/default/locale and is defined by the variables LANG, LC_ALL and LC_CTYPE. Check which values are set for these variables. For example, if the default is UTF-8, these would be LANG="en_US.UTF-8", LC_ALL="en_US.UTF-8" and LC_CTYPE="en_US.UTF-8".

Now assume the default encoding is "XYZ". Python tries to encode the text into bytes using this encoding. If the text contains characters that the encoding cannot represent (for ASCII, anything outside its 128 characters), the error is raised.

To handle this, you have to tell Python the right encoding to use. The standard choice is UTF-8, which can represent every Unicode character. There are also ways to work around or ignore the error, which are covered in the fixes that follow.
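The failure described here can be reproduced in a few lines. This is a minimal sketch, assuming Python 3; the sample string and the helper name try_encode are illustrative and not taken from the original article.

```python
import sys


def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


# U+00A0 (no-break space) is a typical character in scraped web text
# and has no ASCII representation.
sample = "price:\xa0100"

ascii_result = try_encode(sample, "ascii")   # fails, returns the message
utf8_result = try_encode(sample, "utf-8")    # succeeds, returns bytes

print("stdout encoding:", sys.stdout.encoding)
print("ascii:", ascii_result)
print("utf-8:", utf8_result)
```

The same text encodes cleanly as UTF-8, which is why the fixes that follow all come down to making UTF-8 the encoding Python uses.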

 

The ascii codec handles the following sets of ASCII characters, as listed in Python's string module, without trouble:

    whitespace = ' \t\n\r\v\f'
    ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
    ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ascii_letters = ascii_lowercase + ascii_uppercase
    digits = '0123456789'
    hexdigits = digits + 'abcdef' + 'ABCDEF'
    octdigits = '01234567'
    punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
    printable = digits + ascii_letters + punctuation + whitespace

 

Fix –

Set Python's I/O encoding to UTF-8. This affects stdin, stdout and stderr, and it only applies to the current shell session:

    $ export PYTHONIOENCODING=utf8
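One way to check that the variable takes effect is to start a child interpreter with it set and ask for its stdout encoding. This is a sketch assuming Python 3; the helper name child_stdout_encoding is hypothetical.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set; return its stdout encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


reported = child_stdout_encoding("utf-8")
print(reported)
```

The child's stdout is a pipe here, which is exactly the situation where the default encoding is most likely to differ from what a terminal session would use.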

 


Set the environment variables correctly in /etc/default/locale. This sets the system's default locale encoding to UTF-8:

    LANG="en_US.UTF-8"
    LC_ALL="en_US.UTF-8"
    LC_CTYPE="en_US.UTF-8"

Or set them from the command line:

    export LANG="en_US.UTF-8"
    export LC_ALL="en_US.UTF-8"
    export LC_CTYPE="en_US.UTF-8"

Use a locale that is installed on the system (locale -a lists them). A bare "UTF-8" is generally not a valid locale name on Linux.
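To see which values Python actually picked up, the locale variables and the resulting preferred encoding can be inspected from inside the interpreter. A small sketch, assuming Python 3; the variable names are illustrative.

```python
import locale
import os

# The three variables named in the text; None means the variable is unset.
locale_vars = {name: os.environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")}

# The encoding Python derives from the locale, without changing the locale.
preferred = locale.getpreferredencoding(False)

for name, value in locale_vars.items():
    print(name, "=", value)
print("preferred encoding:", preferred)
```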

 

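Set the encoding at code level by encoding explicitly:

    str1 = u'abcd\xe9'  # example value containing a non-ASCII character
    str2 = str1.encode('utf-8')
    print(str1.encode('utf-8'))
    print(str2)

With the 'ignore' error handler, any character that cannot be encoded is skipped instead of raising an error:

    str1 = u'abcd\xe9'  # example value
    str2 = str1.encode('utf-8', 'ignore').decode('utf-8')
    print(str2)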

 

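Set the default encoding using sys (Python 2 only). sys.setdefaultencoding() does not exist in Python 3, and the reload(sys) trick is widely discouraged because it changes behavior for every library in the process:

    # encoding=utf8
    from __future__ import unicode_literals
    import sys
    reload(sys)
    sys.setdefaultencoding('utf8')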

 

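Set the encoding using locale:

    import os
    import locale
    os.environ["PYTHONIOENCODING"] = "utf-8"
    scriptLocale = locale.setlocale(category=locale.LC_ALL, locale="en_GB.UTF-8")

PYTHONIOENCODING is read when the interpreter starts, so setting it in os.environ only affects Python processes launched afterwards, not the running script. locale.setlocale() raises locale.Error if the en_GB.UTF-8 locale is not installed.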
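An environment variable cannot change the streams of a script that is already running, so an alternative is to wrap the output stream with an explicit encoding. This is a minimal sketch, assuming Python 3; an in-memory io.BytesIO stands in for a real binary stream such as sys.stdout.buffer, and the helper name write_utf8 is hypothetical. On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") is a shorter route when stdout is a regular text stream.

```python
import io


def write_utf8(text, binary_stream):
    """Write text to a binary stream, always encoding it as UTF-8."""
    wrapper = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")
    wrapper.write(text)
    wrapper.flush()
    wrapper.detach()  # leave the underlying stream open for the caller


buffer = io.BytesIO()            # stand-in for a real binary output stream
write_utf8("caf\xe9\n", buffer)  # U+00E9 would fail under the ascii codec

data = buffer.getvalue()
round_trip = data.decode("utf-8")
print(data)
```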

 

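Declare the source file encoding with an Emacs-style coding comment. The declaration must be on the first or second line of the file, and it tells Python how to decode the source file itself, so that non-ASCII literals such as 'abcdé' are read correctly:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    u = 'abcdé'
    print(ord(u[-1]))

A shorter form is also accepted:

    #!/usr/bin/env python
    # coding: utf8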

 

If you do not need the non-ASCII characters and can safely discard them, you can use the option below. In this example, str2 no longer contains any non-ASCII characters (they are dropped):

    str2 = str1.encode('ascii', 'ignore').decode('ascii')
    print(str2)
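Before discarding characters this way, it helps to see exactly what gets lost. The sketch below assumes Python 3; the helper name strip_non_ascii and the sample text are illustrative.

```python
def strip_non_ascii(text):
    """Drop every character that the ascii codec cannot encode."""
    return text.encode("ascii", "ignore").decode("ascii")


original = "na\xefve caf\xe9\u2026"  # contains U+00EF, U+00E9 and U+2026
cleaned = strip_non_ascii(original)

# The characters that were silently removed.
dropped = [ch for ch in original if ord(ch) > 127]

print(ascii(original))
print(cleaned)
print([ascii(ch) for ch in dropped])
```

The dropped characters are removed without any substitute, so words can change ("café" becomes "caf"). That is acceptable for log output, but usually not for data you intend to store.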

 


Use codecs for file operations. codecs.open() reads and writes files as Unicode text; pass the encoding as the third argument or as encoding="utf-8". The encoding can be utf-8, utf-16, utf-32 and so on. In Python 3, the built-in open() accepts the same encoding argument:

    import codecs
    opened = codecs.open("inputfile.txt", "r", "utf-8")
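A full write-and-read round trip looks like the following sketch. It assumes Python 3 and uses io.open (the same function as the built-in open in Python 3) rather than codecs.open; the file name sample.txt and the temporary directory are placeholders.

```python
import io
import os
import shutil
import tempfile

text = "r\xe9sum\xe9 \u2013 na\xefve\n"  # sample text with non-ASCII characters

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sample.txt")  # placeholder file name

# Write and read with an explicit encoding, so the locale default is never used.
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

with io.open(path, "r", encoding="utf-8") as handle:
    read_back = handle.read()

shutil.rmtree(workdir)  # clean up the temporary directory
print(ascii(read_back))
```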

 

 

Additional points:

In Python 3, UTF-8 is the default source encoding.

The encode() function converts a Unicode string to bytes (it returns a bytes representation of the string). Various encode() options:

    encode('ascii', 'ignore')
    encode('ascii', 'replace')
    encode('ascii', 'xmlcharrefreplace')
    encode('ascii', 'backslashreplace')
    encode('ascii', 'namereplace')

The decode() function converts bytes to a string. It takes an encoding argument, such as UTF-8, and optionally an errors argument. The errors argument (e.g. "ignore") specifies what happens when the bytes cannot be converted with the given encoding. Various decode() options:

    decode("utf-8", "strict")
    decode("utf-8", "replace")
    decode("utf-8", "backslashreplace")
    decode("utf-8", "ignore")

UTF-8 properties:

- It can handle any Unicode code point.
- A string of ASCII text is also valid UTF-8 text.
- UTF-8 is a byte-oriented encoding: each character is represented by a specific sequence of one or more bytes. This avoids the byte-ordering issues that can occur with integer- and word-oriented encodings, like UTF-16 and UTF-32, where the sequence of bytes varies depending on the hardware on which the string was encoded.
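The error handlers are easiest to compare side by side. The sketch below assumes Python 3.5 or later (where 'namereplace' is available); the sample text and the byte string are illustrative.

```python
text = "caf\xe9"  # sample text; U+00E9 is outside ASCII

encode_handlers = ["ignore", "replace", "xmlcharrefreplace",
                   "backslashreplace", "namereplace"]
encoded = {name: text.encode("ascii", name) for name in encode_handlers}

# Not valid UTF-8: 0xE9 starts a multi-byte sequence that never completes.
raw = b"caf\xe9"

decode_handlers = ["replace", "backslashreplace", "ignore"]
decoded = {name: raw.decode("utf-8", name) for name in decode_handlers}

# "strict" is the default handler and raises instead of substituting.
try:
    raw.decode("utf-8", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

for name, value in encoded.items():
    print("encode", name, value)
for name, value in decoded.items():
    print("decode", name, ascii(value))
print("strict raised:", strict_failed)
```

Of the encode handlers, only 'ignore' and 'replace' lose information; the other three keep enough detail to recover the original character later.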

 

Hope this helps to solve the issue.

 


 


 


